APPENDIX B 



The attached Appendix is considered to be part of and included in the present Amendment. 



YACC example 
%token A
%token BODY_UIBODY
%token IMG
%token INPUT_BUTTON
%token INPUT_CHECKBOX
%token INPUT_RADIO
%token INPUT_TEXT
%token SELECT_UICOMBO
%token SELECT_UILISTBOX
%token SPAN
%token SPAN_UILABEL
%token TD
%token TD_UIBTN
%token TD_UIBTNTXT
%token TD_UIMENUBAR
%token TD_UIMENUITEM
%token TD_UITOOLBAR
%token TD_UIPANEL

%% 

1. body 

The symbol "body" is the starting symbol of the grammar and it refers to the BODY 
HTML tag. In fact the right hand side of the rule begins with the recognition of a 
BODY HTML tag, as the first DOM element that is analyzed is always this tag. 
Notice the use of error tokens throughout the rules denoting areas of the input that 
are ignored. The illustrated system takes advantage of the fact that the yacc 
generated LALR(1) parser uses an error handling and recovery mechanism that is 
suitable for ignoring certain parts of the input stream. 

: BODY_UIBODY error TD_UIMENUBAR menu TD_UITOOLBAR toolbar TD_UIPANEL panel
The symbol stack of the parser contains either references to DOM elements, in the
case of terminal symbols (tokens representing HTML tags, such as TD_UIMENUBAR above),
or a pointer to the tree of generated UI objects, in the case of non-terminal symbols
(such as menu, toolbar and panel above). The step below concatenates the trees of UI
objects and puts the result in m_sResult, the output of the parser. Note that
this action is executed as the final step of the syntactical analysis, when this
rule is reduced to the starting symbol.

Here, the tokens based on TD tags partition the input stream. The first TD
tag with the TD_UIMENUBAR classname attribute signals the beginning of the menu
bar (menu). This is one example where the grammar-based approach shows its
strength: it is very easy to create rules that are applied locally (to a certain
part of the input stream), and the context is always well defined and followed by the
parser.

{
    CParserSymbol *t;

    m_sResult = JoinAsSiblings(t = JoinAsSiblings($4, $6), $8);
    m_iResult = 1;
    $$ = NULL;
}

| error
{
    m_sResult = NULL;
    m_iResult = 0;
    $$ = NULL;
}
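
As a rough illustration of how the three TD marker tokens partition the input, the following C++ sketch splits a token stream into menu, toolbar and panel sections and drops everything before the first marker, much as the error token lets the parser skip it. This is not the yacc-generated parser: the Token enumeration, the Sections struct and partitionSections are hypothetical names introduced only for this illustration.

```cpp
#include <vector>

// Hypothetical token kinds; the names mirror some of the %token declarations.
enum Token { BODY_UIBODY, TD_UIMENUBAR, TD_UITOOLBAR, TD_UIPANEL,
             TD_UIMENUITEM, TD_UIBTN, A, OTHER };

// The three parts of the input that the body rule hands to menu, toolbar and panel.
struct Sections {
    std::vector<Token> menu, toolbar, panel;
};

// Splits the stream at the marker tokens, in the order the body rule expects them.
// Tokens seen before the first marker are ignored.
Sections partitionSections(const std::vector<Token>& input) {
    Sections out;
    std::vector<Token>* current = nullptr;  // no section opened yet
    for (Token t : input) {
        if (t == TD_UIMENUBAR && current == nullptr) {
            current = &out.menu;
        } else if (t == TD_UITOOLBAR && current == &out.menu) {
            current = &out.toolbar;
        } else if (t == TD_UIPANEL && current == &out.toolbar) {
            current = &out.panel;
        } else if (current) {
            current->push_back(t);          // ordinary token of the open section
        }
    }
    return out;
}
```

Each section would then be processed by its own local rules, which is the locality the grammar-based approach provides.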

The following two groups of rules (menu, menu_item) recognize the menu (shown at
the top of the browser client area in the screenshot). The first rule group makes
it possible to process several menu items; it is one way to express iteration over
items in the grammar.
menu 

: menu_item 
{ 

$$ = $1; 
} 

| menu menu_item 

The created UI objects must be joined in a tree; that is the purpose of the
JoinAsSiblings function.

{
    $$ = JoinAsSiblings($1, $2);
}
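
The effect of the left-recursive menu rule can be pictured as a fold over the recognized items. The sketch below is a simplified stand-in, not the appendix's code: Node, joinAsSiblings and reduceMenu are hypothetical names, and a null pointer plays the role of the NULL value produced by an error alternative.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a UI object tree node: a name and a sibling link.
struct Node {
    explicit Node(std::string n) : name(std::move(n)) {}
    std::string name;
    Node* sibling = nullptr;
};

// Appends b to the end of a's sibling chain and returns the head of the result.
// Either argument may be null.
Node* joinAsSiblings(Node* a, Node* b) {
    if (!a) return b;
    if (!b) return a;
    Node* tail = a;
    while (tail->sibling) tail = tail->sibling;
    tail->sibling = b;
    return a;
}

// Mimics the reductions of: menu : menu_item | menu menu_item
Node* reduceMenu(const std::vector<Node*>& items) {
    Node* menu = nullptr;
    for (Node* item : items) menu = joinAsSiblings(menu, item);
    return menu;
}
```

Skipped input (a null item) leaves the chain unchanged, so the resulting list contains only the recognized menu items in document order.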



menu_item

: TD_UIMENUITEM

The constructor of the class CGenericObject is the place where the list of UI object
identifiers that correspond to the graphical elements displayed in the browser is
generated (see Figure 3, item 48). Inside the constructor, the parser is able to
access various properties of a DOM element, in this case its innerText and
clientRects properties. In this simple instance there is a one-to-one relationship
between the DOM element (the TD tag that is the only symbol on the right side of the
rule; this element of the symbol stack is accessed with $1) and the UI object.

{
    $$ = new CGenericObject(this, NULL, NULL, L"menuitem", L"ROLE_SYSTEM_MENUITEM",
        0, 0, 0, 0, -1, 1, $1);
}

| error 

Anything other than the above TD_UIMENUITEM tags is ignored by this error
alternative.

{ 

$$ = NULL; 

} 
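
The one-to-one case can be sketched in plain C++ as follows. The DomElement and UiObject structs, trim and makeMenuItem are hypothetical simplifications introduced for illustration; the appendix's CGenericObject works on COM interfaces instead, and the trimming step only approximates what a helper such as CropWhitespace presumably does.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Rect { long left, top, right, bottom; };

// Hypothetical, simplified view of a DOM element.
struct DomElement {
    std::string tag, className, innerText;
    std::vector<Rect> clientRects;
};

// Hypothetical, simplified UI object.
struct UiObject {
    std::string cls, name;
    std::vector<Rect> rects;
};

// Removes leading and trailing whitespace.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// One DOM element yields one UI object: the name comes from innerText,
// the screen area from clientRects.
UiObject makeMenuItem(const DomElement& td) {
    UiObject ui;
    ui.cls = "menuitem";
    ui.name = trim(td.innerText);
    ui.rects = td.clientRects;
    return ui;
}
```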

Iteration over toolbar buttons is done the same way as for menu items. 

toolbar

: toolbar_item

{ 

$$ = $1; 

} 

| toolbar toolbar_item
{
    $$ = JoinAsSiblings($1, $2);
}



2. toolbar_item

The same logical UI abstraction (a button) might be implemented in many different
ways even in one application. These alternatives must be present here in the
grammar. These rules also present examples of logical UI objects that are created
from more than one DOM element. The clientRects attribute is read from the SPAN tag,
while the logical name of the object might be found as the ALT attribute of the IMG
tag (first alternative) or as the innerText of the SPAN tag (second alternative).
: TD_UIBTN '{' SPAN '{' IMG '}' '}'
{
    $$ = new CGenericObject(this, NULL, NULL, L"pushbutton", L"ROLE_SYSTEM_PUSHBUTTON",
        -1, 0, 0, 0, -1, 2, $1, $5);
    CComVariant vAlt;
    CComBSTR sAlt("alt");
    ((CPSHTMLElement*)$5)->m_pElement->getAttribute(sAlt, 0, &vAlt);
    if (vAlt.vt == VT_BSTR)
        ((CBXObjectTree*)$$)->m_OD.odName = ::SysAllocString(vAlt.bstrVal);
}

| TD_UIBTNTXT '{' SPAN '{' IMG '}' '}'

This alternative differs from the first only in the classname of the SPAN element.
These classnames are mapped to different tokens, although they could be unified in the
scanner/tokenizer. Keeping separate tokens makes the parser more complex but makes the
creation of UI objects simpler.

{
    $$ = new CGenericObject(this, NULL, NULL, L"pushbutton",
        L"ROLE_SYSTEM_PUSHBUTTON", 0, 0, 0, 0, -1, 2, $1, $5);
}

| error
{ 

$$ = NULL; 

} 
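
The trade-off between separate button tokens and tokens unified in the scanner can be made concrete with a small classification function. This is only a sketch under stated assumptions: the classname strings ("uibtn", "uibtntxt", "uilabel"), the TOK_NONE value and the classify function are hypothetical, while the remaining token names follow the %token declarations.

```cpp
#include <map>
#include <string>

enum Token { TOK_NONE, TD, TD_UIBTN, TD_UIBTNTXT, SPAN, SPAN_UILABEL };

// Maps an HTML tag and its classname to a token. With unifyButtons set, both
// button classnames yield TD_UIBTN, which is what unifying them in the
// scanner would amount to.
Token classify(const std::string& tag, const std::string& className,
               bool unifyButtons = false) {
    static const std::map<std::string, Token> table = {
        {"TD:", TD},
        {"TD:uibtn", TD_UIBTN},
        {"TD:uibtntxt", TD_UIBTNTXT},
        {"SPAN:", SPAN},
        {"SPAN:uilabel", SPAN_UILABEL},
    };
    auto it = table.find(tag + ":" + className);
    if (it == table.end()) it = table.find(tag + ":");  // unknown class: plain tag
    if (it == table.end()) return TOK_NONE;             // unknown tag
    if (unifyButtons && it->second == TD_UIBTNTXT) return TD_UIBTN;
    return it->second;
}
```

With unified tokens the grammar would need only one button alternative, but its action would then have to inspect the element to decide where the name comes from.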

Iteration over the rest of the controls is done the same way as for menu items. 

3. panel 

: panel_item 



{ 

$$ = $1; 
} 

| panel panel_item
{
    $$ = JoinAsSiblings($1, $2);
}

The following groups of rules describe the various controls that can be found in the 
example application. 
4. panel_item
: A
{
    $$ = new CGenericObject(this, NULL, NULL, L"link", L"ROLE_SYSTEM_LINK", 0, 0, 0, 0,
        -1, 1, $1);
}

| INPUT_BUTTON
{
    $$ = new CGenericObject(this, NULL, NULL, L"pushbutton", L"ROLE_SYSTEM_PUSHBUTTON",
        -1, 0, 0, 0, -1, 1, $1);
    CComVariant vValue;
    CComBSTR sValue("value");
    ((CPSHTMLElement*)$1)->m_pElement->getAttribute(sValue, 0, &vValue);
    if (vValue.vt == VT_BSTR)
        ((CBXObjectTree*)$$)->m_OD.odName = ::SysAllocString(vValue.bstrVal);
}

| INPUT_CHECKBOX SPAN_UILABEL
{
    $$ = new CGenericObject(this, NULL, NULL, L"checkbutton",
        L"ROLE_SYSTEM_CHECKBUTTON", 1, 0, 0, 0, -1, 2, $1, $2);
}

| INPUT_RADIO SPAN_UILABEL
{
    $$ = new CGenericObject(this, NULL, NULL, L"radiobutton",
        L"ROLE_SYSTEM_RADIOBUTTON", 1, 0, 0, 0, -1, 2, $1, $2);
}

| panel_item_nl
{
    $$ = $1;
}

| TD '{' SPAN_UILABEL '{' '}' '}' TD '{' panel_item_nl

This rule assigns labels to controls. panel_item_nl is a non-terminal, which means
that the symbol stack already contains a recognized control's object at the time the
action executes. The action itself copies the innerText of the second
SPAN tag to the name member of the control's UI object.

{
    ((CPSHTMLElement*)$3)->m_pElement->get_innerText(&((CBXObjectTree*)$9)->m_OD.odName);
    $$ = $9;
}

| error 

This alternative makes the parser ignore any unimportant input. Any sequence of
tokens that differs from the previous alternatives is skipped.

{
    $$ = NULL;
}
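
The panel_item actions obtain a control's name from different places: an attribute of the element (such as value), or the innerText of a separate label element. A minimal sketch of these two lookups follows, assuming a simplified element model; DomElement, UiControl, nameFromAttribute and applyLabel are hypothetical names and do not correspond to the COM calls used in the actions.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified DOM element: text content plus attributes.
struct DomElement {
    std::string innerText;
    std::map<std::string, std::string> attributes;
};

// Hypothetical UI object carrying only the logical name.
struct UiControl {
    std::string name;
};

// Returns the attribute's value when it is present and non-empty,
// otherwise the supplied fallback.
std::string nameFromAttribute(const DomElement& e, const std::string& attr,
                              const std::string& fallback) {
    auto it = e.attributes.find(attr);
    if (it != e.attributes.end() && !it->second.empty()) return it->second;
    return fallback;
}

// Copies the label element's text into an already recognized control,
// as the label rule does for panel_item_nl.
void applyLabel(UiControl& control, const DomElement& labelSpan) {
    control.name = labelSpan.innerText;
}
```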

panel_item_nl
: INPUT_TEXT
{
    $$ = new CGenericObject(this, NULL, NULL, L"text",
        L"ROLE_SYSTEM_TEXT", -1, 0, 0, 0, -1, 1, $1);
}

| SELECT_UICOMBO
{
    $$ = new CGenericObject(this, NULL, NULL, L"combobox", L"ROLE_SYSTEM_COMBOBOX",
        -1, 0, 0, 0, -1, 1, $1);
}

| SELECT_UILISTBOX
{
    $$ = new CGenericObject(this, NULL, NULL, L"list",
        L"ROLE_SYSTEM_LIST", -1, 0, 0, 0, -1, 1, $1);
}

%% 

C++ helper classes 

typedef struct tagSBXObjectDescriptor
{
    BSTR odClass;
    BSTR odLogicalClass;
    BSTR odName;
    BSTR odLabel;
    DWORD odStyle;
    DWORD odStatus;
    DWORD odFlags;
    ULONG uRectCount;
    [size_is(uRectCount)]
    RECT *odRect;
    BSTR odContents;
} SBXObjectDescriptor;

class CParserSymbol
{
public:
    CParserSymbol();
    virtual ~CParserSymbol();
};

class CPSHTMLElement : public CParserSymbol
{
public:
    CPSHTMLElement(const CHTMLElementPtr&, int iToken);
    ~CPSHTMLElement();

    CHTMLElementPtr m_pElement;
    int m_iToken;
};

class CBXObjectDescriptor : public SBXObjectDescriptor
{
public:
    CBXObjectDescriptor();
    ~CBXObjectDescriptor();

    SBXObjectDescriptor *pSBXObjectDescriptor);
};

class CBXObjectTree : public CParserSymbol
{
public:
    CBXObjectTree(CBXObjectTree*, CBXObjectTree*);
    ~CBXObjectTree();
    CBXObjectTree *m_pChild;
    CBXObjectTree *m_pSibling;
    CBXObjectDescriptor m_OD;
};

class CGenericObject : public CBXObjectTree
{
public:
    CGenericObject(yyfparser*, CBXObjectTree*, CBXObjectTree*);
    CGenericObject(yyfparser*, CBXObjectTree*, CBXObjectTree*, LPCWSTR szClass,
        LPCWSTR szLogClass, LONG uName, DWORD dwStatus, LONG uStartRect, LONG uEndRect,
        LONG uContents, ULONG uParamCount, ...);
    ~CGenericObject();

    void FillRectArray(const CHTMLElement2Ptr &pElement2, const POINT &pt);
    CElementArray m_aParams;
    CDWordArray m_aParamTokens;
    LONG m_uStartRect;
    LONG m_uEndRect;
    LONG m_uContents;
    CDWord2PtrMap m_mElementIndex;

    void Construct(LPCWSTR szClass, LPCWSTR szLogClass, LONG uName, DWORD dwStatus,
        LONG uStartRect, LONG uEndRect, LONG uContents, ULONG uParamCount, va_list);
};

CParserSymbol *JoinAsSiblings(CParserSymbol *&p1, CParserSymbol *&p2)
{
    CBXObjectTree *pTree1 = (CBXObjectTree*)p1;
    if (p1 && p2) {
        // Walk to the end of the first sibling chain and append the second tree.
        while (pTree1->m_pSibling) {
            pTree1 = pTree1->m_pSibling;
        }
        pTree1->m_pSibling = (CBXObjectTree*)p2;
    }
    CParserSymbol *r = p1 ? p1 : p2;
    p1 = NULL; p2 = NULL; return r;
}

CParserSymbol *JoinAsDescendants(CParserSymbol *&p1,
        CParserSymbol *&p2)
{
    CBXObjectTree *pTree1 = (CBXObjectTree*)p1;
    if (p1 && p2) {
        // Walk to the deepest first child and attach the second tree below it.
        while (pTree1->m_pChild) {
            pTree1 = pTree1->m_pChild;
        }
        pTree1->m_pChild = (CBXObjectTree*)p2;
    }
    CParserSymbol *r = p1 ? p1 : p2;
    p1 = NULL;
    p2 = NULL;
    return r;
}
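
Both join helpers take their arguments by reference and set them to NULL, so after a join only the returned pointer refers to the combined tree. The sketch below reproduces that behavior for the descendant case on a bare node type; Node and joinAsDescendants are hypothetical, simplified stand-ins for CBXObjectTree and JoinAsDescendants.

```cpp
// Hypothetical, minimal tree node with first-child and next-sibling links.
struct Node {
    Node* child = nullptr;
    Node* sibling = nullptr;
};

// Attaches p2 below the deepest first child of p1 and returns the combined
// tree. Both inputs are reset to null so that the callers' slots no longer
// refer to the nodes.
Node* joinAsDescendants(Node*& p1, Node*& p2) {
    Node* result = p1 ? p1 : p2;
    if (p1 && p2) {
        Node* deepest = p1;
        while (deepest->child) deepest = deepest->child;
        deepest->child = p2;
    }
    p1 = nullptr;
    p2 = nullptr;
    return result;
}
```

Resetting the inputs presumably keeps a node from being reachable, and later deleted, through two different owners; that reading is an inference from the code rather than something the appendix states.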

CParserSymbol::CParserSymbol()
{
}

CParserSymbol::~CParserSymbol()
{
}

CPSHTMLElement::CPSHTMLElement(const CHTMLElementPtr &pElement, int iToken)
    : m_pElement(pElement), m_iToken(iToken)
{
}

CPSHTMLElement::~CPSHTMLElement()
{
}

CBXObjectDescriptor::CBXObjectDescriptor()
{
    ZeroMemory(this, sizeof(SBXObjectDescriptor));
}

CBXObjectDescriptor::~CBXObjectDescriptor()
{
    if (odClass)
        ::SysFreeString(odClass);
    if (odLogicalClass)
        ::SysFreeString(odLogicalClass);
    if (odName)
        ::SysFreeString(odName);
    if (odLabel)
        ::SysFreeString(odLabel);
    if (odContents)
        ::SysFreeString(odContents);
    if (odRect)
        delete[] odRect;    // odRect is allocated with new RECT[n]
}

CBXObjectTree::CBXObjectTree(CBXObjectTree *pChild, CBXObjectTree *pSibling)
    : m_pChild(pChild), m_pSibling(pSibling)
{
}

CBXObjectTree::~CBXObjectTree()
{
    if (m_pChild)
        delete m_pChild;
    if (m_pSibling)
        delete m_pSibling;
}

CGenericObject::CGenericObject(yyfparser *pParser, CBXObjectTree *pChild,
        CBXObjectTree *pSibling)
    : CBXObjectTree(pChild, pSibling), m_uStartRect(0), m_uEndRect(-1), m_uContents(-1)
{
}

CGenericObject::CGenericObject(yyfparser *pParser, CBXObjectTree *pChild,
        CBXObjectTree *pSibling, LPCWSTR szClass, LPCWSTR szLogClass, LONG uName, DWORD
        dwStatus, LONG uStartRect, LONG uEndRect, LONG uContents, ULONG uParamCount, ...)
    : CBXObjectTree(pChild, pSibling), m_uStartRect(0), m_uEndRect(-1), m_uContents(-1)
{
    va_list argList;
    va_start(argList, uParamCount);
    Construct(szClass, szLogClass, uName, dwStatus, uStartRect, uEndRect, uContents,
        uParamCount, argList);
    va_end(argList);
}

CGenericObject::~CGenericObject()
{
}
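
The variadic constructor forwards its trailing arguments to Construct through a va_list. A self-contained sketch of that pattern follows, assuming a count parameter that tells the callee how many pointers follow, as uParamCount does above. Element and Collector are hypothetical names used only here.

```cpp
#include <cstdarg>
#include <vector>

// Hypothetical parameter type standing in for the parser symbols.
struct Element {
    int id;
};

class Collector {
public:
    // count must equal the number of Element* arguments that follow it.
    Collector(unsigned long count, ...) {
        va_list args;
        va_start(args, count);   // count is the last named parameter
        construct(count, args);
        va_end(args);
    }

    std::vector<Element*> params;

private:
    // Reads exactly count pointers from the argument list.
    void construct(unsigned long count, va_list args) {
        for (unsigned long i = 0; i < count; ++i) {
            params.push_back(va_arg(args, Element*));
        }
    }
};
```

The compiler cannot check the count or the argument types in such a call, so each grammar action has to pass a count that matches the symbols it lists.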

void CGenericObject::Construct(LPCWSTR szClass, LPCWSTR szLogClass, LONG uName,
        DWORD dwStatus, LONG uStartRect, LONG uEndRect, LONG uContents, ULONG uParamCount,
        va_list argList)
{
    m_uStartRect = uStartRect;
    m_uEndRect = uEndRect;
    m_uContents = uContents;
    CParserSymbol *pElement;
    if (uParamCount)
    {
        // Record each DOM element passed in, its token, and its index.
        for (ULONG i = 0; i < uParamCount; i++)
        {
            pElement = va_arg(argList, CParserSymbol*);
            m_aParams.push_back(((CPSHTMLElement*)pElement)->m_pElement);
            m_aParamTokens.push_back(((CPSHTMLElement*)pElement)->m_iToken);

            CUnknownPtr pElementUnk(((CPSHTMLElement*)pElement)->m_pElement);
            m_mElementIndex.insert(CDWord2PtrMap::value_type((DWORD)(IUnknown*)pElementUnk,
                (LPVOID)i));
        }
    }

    m_OD.odClass = ::SysAllocString(szClass);
    m_OD.odLogicalClass = ::SysAllocString(szLogClass);
    m_OD.odStatus = dwStatus;
    if (uParamCount)
    {
        // uName selects the parameter whose innerText becomes the object's name.
        if (uName >= 0) {
            m_aParams[uName]->get_innerText(&m_OD.odName);
            CropWhitespace(m_OD.odName);
        }
    }

    if (m_OD.odRect)
        delete[] m_OD.odRect;
    m_OD.odRect = NULL;
    m_OD.uRectCount = 0;

    for (LONG i = m_uStartRect; i <= m_uEndRect; i++)
        FillRectArray(CHTMLElement2Ptr(m_aParams[i]), ptOffset);
}

void CGenericObject::FillRectArray(const CHTMLElement2Ptr &pElement2, const POINT
        &ptOffset)
{
    if (!pElement2) {
        return;
    }

    CHTMLRectCollectionPtr pRectCollection;
    pElement2->getClientRects(&pRectCollection);

    if (pRectCollection) {
        // Grow the rectangle array, keeping the rectangles collected so far.
        ULONG uRectCount = m_OD.uRectCount;
        pRectCollection->get_length((long*)&m_OD.uRectCount);
        m_OD.uRectCount += uRectCount;
        RECT *odRect = m_OD.odRect;
        m_OD.odRect = new RECT[m_OD.uRectCount];
        ::ZeroMemory(m_OD.odRect, sizeof(RECT) * m_OD.uRectCount);
        if (odRect) {
            ::CopyMemory(m_OD.odRect, odRect,
                sizeof(RECT) * uRectCount);
            delete[] odRect;
        }

        for (long i = uRectCount; i < (long)m_OD.uRectCount; i++) {
            CComVariant idx = i - (long)uRectCount, rval;
            pRectCollection->item(&idx, &rval);
            if (rval.vt == VT_DISPATCH) {
                CHTMLRectPtr pRect = rval.pdispVal;
                if (pRect) {
                    pRect->get_left(&m_OD.odRect[i].left);
                    pRect->get_top(&m_OD.odRect[i].top);
                    pRect->get_right(&m_OD.odRect[i].right);
                    pRect->get_bottom(&m_OD.odRect[i].bottom);
                }
            }
        }
    }
}
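
FillRectArray grows a raw RECT array by allocating a larger one, copying the old entries and appending the new ones. The same accumulation can be sketched with std::vector, which handles the reallocation itself. This is an alternative illustration, not the appendix's implementation; Rect and appendRects are hypothetical names.

```cpp
#include <vector>

// Hypothetical plain rectangle, standing in for the Win32 RECT.
struct Rect {
    long left, top, right, bottom;
};

// Appends the rectangles of one DOM element to those already collected for
// the UI object, preserving the order in which elements were processed.
void appendRects(std::vector<Rect>& objectRects,
                 const std::vector<Rect>& elementRects) {
    objectRects.insert(objectRects.end(), elementRects.begin(), elementRects.end());
}
```

Calling appendRects once per element in the uStartRect to uEndRect range would give the UI object the union of the screen areas of its DOM elements.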



